[fix](fe) Fix Paimon JDBC driver registration for JNI scans#61513
[fix](fe) Fix Paimon JDBC driver registration for JNI scans#61513xylaaaaa wants to merge 4 commits intoapache:masterfrom
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
Pull request overview
Fixes Paimon JDBC catalog system table queries executed via the BE JNI scanner by ensuring the JDBC driver JAR/class metadata is propagated from FE to BE and registered in a DriverManager-visible classloader before Paimon table initialization.
Changes:
- Add FE-side propagation of JDBC driver URL/class into scan-range Paimon options for BE execution.
- Register JDBC drivers in BE (JNI scanner) using a shared
JdbcDriverUtilshelper andJniScannerClassLoaderURL injection. - Extend/introduce tests: FE unit tests, BE Java extension unit test, and broader regression coverage for Paimon JDBC system tables (including
row_tracking).
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| regression-test/suites/external_table_p0/paimon/test_paimon_jdbc_catalog.groovy | Extends regression to validate all supported Paimon JDBC system tables are readable and adds a row-tracking case. |
| fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStorePropertiesTest.java | Adds UT verifying backend options include full driver URL and driver class. |
| fe/fe-core/src/test/java/org/apache/doris/datasource/paimon/source/PaimonScanNodeTest.java | Adds UT verifying PaimonScanNode extracts backend Paimon options for JDBC catalogs. |
| fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStoreProperties.java | Adds getBackendPaimonOptions() to expose JDBC driver metadata for BE scans. |
| fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java | Propagates backend Paimon options into TPaimonFileDesc for JNI scan ranges. |
| fe/be-java-extensions/paimon-scanner/src/test/java/org/apache/doris/paimon/PaimonJdbcDriverUtilsTest.java | New unit test validating driver registration path for the JNI scanner classloader. |
| fe/be-java-extensions/paimon-scanner/src/main/java/org/apache/doris/paimon/PaimonJniScanner.java | Registers the JDBC driver (when configured) before initializing Paimon table/reader. |
| fe/be-java-extensions/paimon-scanner/src/main/java/org/apache/doris/paimon/PaimonJdbcDriverUtils.java | Adds Paimon-specific wrapper to register JDBC drivers based on params. |
| fe/be-java-extensions/java-common/src/main/java/org/apache/doris/common/jdbc/JdbcDriverUtils.java | Introduces shared driver registration utility (URLClassLoader/JniScannerClassLoader-aware). |
| fe/be-java-extensions/java-common/src/main/java/org/apache/doris/common/classloader/JniScannerClassLoader.java | Adds addURLIfAbsent to support safe dynamic URL injection. |
| be/src/gen_cpp/CMakeLists.txt | Excludes generated Thrift skeleton sources from compilation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return Collections.emptyMap(); | ||
| } | ||
| if (StringUtils.isBlank(driverClass)) { | ||
| throw new IllegalStateException("jdbc.driver_class or paimon.jdbc.driver_class is required when " |
| Map<String, String> params = new HashMap<>(); | ||
| params.put(PaimonJdbcDriverUtils.PAIMON_JDBC_DRIVER_URL, driverJar.toUri().toURL().toString()); | ||
| params.put(PaimonJdbcDriverUtils.PAIMON_JDBC_DRIVER_CLASS, DummyJdbcDriver.class.getName()); | ||
|
|
||
| JniScannerClassLoader scannerClassLoader = new JniScannerClassLoader("paimon-test", List.of(), null); | ||
| PaimonJdbcDriverUtils.registerDriverIfNeeded(params, scannerClassLoader); | ||
|
|
||
| Driver driver = DriverManager.getDriver("jdbc:dummy:test"); | ||
| Assert.assertTrue(driver.acceptsURL("jdbc:dummy:test")); |
| private Path createDriverJar() throws IOException { | ||
| Path jarPath = Files.createTempFile("paimon-jdbc-driver", ".jar"); | ||
| String resourceName = DummyJdbcDriver.class.getName().replace('.', '/') + ".class"; | ||
| try (JarOutputStream jarOutputStream = new JarOutputStream(Files.newOutputStream(jarPath)); | ||
| InputStream inputStream = DummyJdbcDriver.class.getClassLoader().getResourceAsStream(resourceName)) { | ||
| Assert.assertNotNull(inputStream); | ||
| jarOutputStream.putNextEntry(new JarEntry(resourceName)); | ||
| byte[] buffer = new byte[4096]; | ||
| int bytesRead; | ||
| while ((bytesRead = inputStream.read(buffer)) >= 0) { | ||
| jarOutputStream.write(buffer, 0, bytesRead); | ||
| } | ||
| jarOutputStream.closeEntry(); | ||
| } | ||
| return jarPath; | ||
| } |
TPC-H: Total hot run time: 26912 ms |
TPC-DS: Total hot run time: 168467 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
There was a problem hiding this comment.
Code Review Summary
This PR fixes Paimon JDBC system table queries that run through the BE JNI scanner by propagating JDBC driver metadata from FE to BE and registering the driver before Paimon table initialization. The approach is correct and the end-to-end data flow works: FE puts jdbc.driver_url and jdbc.driver_class into paimon_options → BE C++ adds paimon. prefix → Java scanner reads paimon.jdbc.driver_url / paimon.jdbc.driver_class → driver registration happens before initTable().
Critical Checkpoint Conclusions
- Goal accomplished? Yes. The code correctly propagates JDBC driver info from FE → BE thrift → JNI scanner, and registers the driver before Paimon table initialization, fixing the
No suitable driver founderror. - Test coverage? Adequate. Unit tests cover the FE
getBackendPaimonOptions(), the BE-sidePaimonJdbcDriverUtils.registerDriverIfNeeded()(including error cases), and the regression test covers all JDBC system tables. - Concurrency?
JniScannerClassLoader.addURLIfAbsent()issynchronized,REGISTERED_DRIVER_KEYSusesConcurrentHashMap.newKeySet(), andDRIVER_CLASS_LOADER_CACHEusesConcurrentHashMap.computeIfAbsent(). The race betweenREGISTERED_DRIVER_KEYS.add()and the subsequent registration inJdbcDriverUtils.registerDriver()is acceptable — worst case is a redundantDriverManager.registerDriver()call, which is idempotent. - Lifecycle / memory?
DRIVER_CLASS_LOADER_CACHEentries are never evicted, which is acceptable since the number of distinct JDBC driver URLs is bounded and small. - Incompatible changes? None. Uses the existing
paimon_optionsthrift field, no new thrift fields added. - Parallel code paths? The
PaimonCppReader(native C++ reader) also readspaimon_optionsbut only for file format hints, not driver registration — this is correct since native reader doesn't need JDBC drivers. - Configuration items? No new configs added.
- FE-BE variable passing? Properly uses the existing
paimon_optionsmap inTPaimonFileDesc.
Issues Found
See inline comments for details. Summary of issues:
- Unrelated change:
be/src/gen_cpp/CMakeLists.txtmodification is unrelated to the Paimon JDBC driver fix and should be in a separate PR. - Per-split redundancy:
paimon_optionsis set identically for every split, while the codebase convention (as seen withpaimon_predicateand Hadoop conf) is to set scan-node-wide data at the scan-node level. This adds unnecessary serialization overhead for many-split queries. - Code duplication opportunity:
DriverShimis now triplicated across the codebase (FE Paimon, FE Iceberg, BE java-extensions). The newJdbcDriverUtilswas created as a shared utility but the FE-side copies were not consolidated.
| ${GEN_CPP_DIR}/*.cc | ||
| ) | ||
| list(FILTER SRC_FILES EXCLUDE REGEX "_server\\.skeleton\\.cpp$") | ||
|
|
There was a problem hiding this comment.
This change (skeleton filter + trailing blank line removal) is unrelated to the Paimon JDBC driver fix. It was introduced by a different commit (JuiceFS compatibility). Per the commit standards in AGENTS.md: "Files in git commit should only be related to the current modification task." This should be in a separate PR or excluded from this one.
| } | ||
| fileDesc.setFileFormat(fileFormat); | ||
| if (!backendPaimonOptions.isEmpty()) { | ||
| fileDesc.setPaimonOptions(backendPaimonOptions); |
There was a problem hiding this comment.
Per-split redundancy: backendPaimonOptions contains the same JDBC driver URL and class for every split, yet it's set per-split in TPaimonFileDesc.paimon_options. This is inconsistent with the existing pattern — see the comment at line 267-268 ("Hadoop conf is set at ScanNode level via params.properties in createScanRangeLocations(), no need to set it for each split to avoid redundant configuration") and line 205-207 where paimon_predicate is also set at the scan-node level.
For queries with many splits, this means the same two key-value pairs are serialized redundantly in every TFileRangeDesc. Consider setting these options at the scan-node level (e.g., via params.properties or a dedicated field in TFileScanRangeParams) instead.
| } | ||
|
|
||
| private static final class DriverShim implements Driver { | ||
| private final Driver delegate; |
There was a problem hiding this comment.
Code duplication: This DriverShim class is the third identical copy in the codebase — the other two are:
PaimonJdbcMetaStoreProperties.DriverShim(FE side, line 222)IcebergJdbcMetaStoreProperties.DriverShim(FE side, line 191)
Since JdbcDriverUtils was explicitly created as a shared utility class, consider also consolidating the FE-side copies to reduce maintenance burden. (The FE copies live in a different classloader context, but the DriverShim class itself is trivially shareable — it's a pure delegation wrapper with no classloader-specific behavior.)
This is a minor improvement suggestion, not a blocking issue.
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Paimon JDBC system tables that execute through the BE JNI scanner could not initialize the PostgreSQL JDBC driver, so queries such as snapshots/manifests/partitions failed with 'No suitable driver found'. This change propagates JDBC driver metadata to BE and registers the driver in a DriverManager-visible classloader before Paimon table initialization.
### Release note
Fix Paimon JDBC catalog system table queries that run through the BE JNI scanner.
### Check List (For Author)
- Test: FE unit test / Java extension unit test / Manual test
- FE unit test: ./run-fe-ut.sh --run org.apache.doris.datasource.property.metastore.PaimonJdbcMetaStorePropertiesTest,org.apache.doris.datasource.paimon.source.PaimonScanNodeTest
- Java extension unit test: env JAVA_HOME=/mnt/disk1/chenjunwei/doris_tools/jdk-17.0.9+9 /mnt/disk1/chenjunwei/doris_tools/apache-maven-3.9.9/bin/mvn -pl be-java-extensions/paimon-scanner -am test -Dtest=PaimonJdbcDriverUtilsTest -DfailIfNoTests=false
- Manual test: Paimon JDBC catalog system tables on local FE/BE cluster
- Behavior changed: Yes (Paimon JDBC system tables now work through BE JNI scans)
- Does this need documentation: No
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: The Paimon JDBC regression case only verified schemas and snapshots, which left the other system tables and the row_tracking table unexercised. Extend the regression case to cover every system table exposed through table$system_name syntax, including a separate row-tracking-enabled table.
### Release note
None
### Check List (For Author)
- Test: Manual test
- Manual test: Verified all Paimon JDBC system tables on the local FE/BE cluster, including row_tracking on a row-tracking-enabled table
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#61513 Problem Summary: Address review feedback for Paimon JDBC JNI scans by moving backend Paimon options to scan-level params with BE fallback, aligning driver_class validation exceptions, cleaning up JDBC driver test state, and removing the unrelated build change from the PR branch history. ### Release note None ### Check List (For Author) - Test: Unit Test / Static check - git diff --check on the staged files - ./run-fe-ut.sh --run org.apache.doris.datasource.property.metastore.PaimonJdbcMetaStorePropertiesTest,org.apache.doris.datasource.paimon.source.PaimonScanNodeTest,org.apache.doris.paimon.PaimonJdbcDriverUtilsTest (still running in the PR worktree at commit time) - Behavior changed: Yes (Paimon backend options now propagate at scan level with BE fallback; validation exception type is aligned) - Does this need documentation: No
87b2ee4 to
a0ed403
Compare
### What problem does this PR solve? Issue Number: None Related PR: apache#61513 Problem Summary: Adjust the Paimon C++ reader conditional formatting so the updated option fallback logic matches the repository style and does not trip format checks. ### Release note None ### Check List (For Author) - Test: No need to test (format-only change); verified with `git diff --check -- be/src/format/table/paimon_cpp_reader.cpp` - Behavior changed: No - Does this need documentation: No
|
run buildall |
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 26994 ms |
TPC-DS: Total hot run time: 170657 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Paimon JDBC system tables that execute through the BE JNI scanner could not initialize the PostgreSQL JDBC driver, so queries such as snapshots/manifests/partitions failed with
No suitable driver found. This change propagates JDBC driver metadata to BE and registers the driver in a DriverManager-visible classloader before Paimon table initialization. The regression case is also extended to cover all supported Paimon JDBC system tables, includingrow_trackingon a row-tracking-enabled table.Release note
Fix Paimon JDBC catalog system table queries that run through the BE JNI scanner.
Check List (For Author)
Manual test:
./run-fe-ut.sh --run org.apache.doris.datasource.property.metastore.PaimonJdbcMetaStorePropertiesTest,org.apache.doris.datasource.paimon.source.PaimonScanNodeTestenv JAVA_HOME=/mnt/disk1/chenjunwei/doris_tools/jdk-17.0.9+9 /mnt/disk1/chenjunwei/doris_tools/apache-maven-3.9.9/bin/mvn -pl be-java-extensions/paimon-scanner -am test -Dtest=PaimonJdbcDriverUtilsTest -DfailIfNoTests=falseVerified all Paimon JDBC system tables on the local FE/BE cluster, including
row_trackingon a row-tracking-enabled tableBehavior changed:
Does this need documentation?